Causal meets Submodular: Subset Selection with Directed Information

Authors

  • Yuxun Zhou
  • Costas J. Spanos
Abstract

We study causal subset selection with Directed Information as the measure of prediction causality. Two typical tasks, causal sensor placement and covariate selection, are formulated as cardinality-constrained directed information maximizations. To attack these NP-hard problems, we show that the first is submodular though not necessarily monotonic, and that the second is “nearly” submodular. To substantiate the idea of approximate submodularity, we introduce a novel quantity, the submodularity index (SmI), for general set functions. Moreover, we show that, based on SmI, the greedy algorithm has a performance guarantee for the maximization of possibly non-monotonic and non-submodular functions, justifying its use for a much broader class of problems. We evaluate the theoretical results with several case studies and also illustrate the application of subset selection to causal structure learning.
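The greedy algorithm mentioned in the abstract can be illustrated with a minimal sketch of cardinality-constrained set-function maximization. This is not the authors' implementation: the function name `greedy_select`, the toy data `COVERAGE`, and the objective `coverage` are hypothetical, and the coverage objective is only a stand-in for a directed information objective, which would have to be estimated from time-series data.

```python
from typing import Callable, FrozenSet, Hashable, Iterable, List


def greedy_select(
    ground_set: Iterable[Hashable],
    f: Callable[[FrozenSet[Hashable]], float],
    k: int,
) -> List[Hashable]:
    """Select up to k elements, adding at each step the element that
    yields the largest objective value (ties go to the earliest element)."""
    selected: List[Hashable] = []
    remaining = list(ground_set)
    for _ in range(k):
        if not remaining:
            break
        best, best_val = None, None
        for x in remaining:
            val = f(frozenset(selected) | {x})
            if best_val is None or val > best_val:
                best, best_val = x, val
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy objective (assumption): each candidate "covers" some items,
# and f(S) counts the items covered by S. This function is submodular and
# monotone; it is NOT directed information.
COVERAGE = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1, 7}}


def coverage(subset: FrozenSet[str]) -> int:
    covered = set()
    for s in subset:
        covered |= COVERAGE[s]
    return len(covered)


if __name__ == "__main__":
    chosen = greedy_select(COVERAGE, coverage, k=2)
    print(chosen, coverage(frozenset(chosen)))
```

This sketch always adds k elements. For a non-monotonic objective, such as the one the abstract describes for sensor placement, a variant that stops once no candidate improves the objective may be preferable; that variant is a design choice here, not something stated in the abstract.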


Similar resources

Mental Arithmetic Task Recognition Using Effective Connectivity and Hierarchical Feature Selection From EEG Signals

Introduction: Mental arithmetic analysis based on Electroencephalogram (EEG) signal for monitoring the state of the user’s brain functioning can be helpful for understanding some psychological disorders such as attention deficit hyperactivity disorder, autism spectrum disorder, or dyscalculia where the difficulty in learning or understanding the arithmetic exists. Most mental arithmetic recogni...


Causal Markov Condition for Submodular Information Measures

The causal Markov condition (CMC) is a postulate that links observations to causality. It describes the conditional independences among the observations that are entailed by a causal hypothesis in terms of a directed acyclic graph. In the conventional setting, the observations are random variables and the independence is a statistical one, i.e., the information content of observations is measur...


Submodular meets Spectral: Greedy Algorithms for Subset Selection, Sparse Approximation and Dictionary Selection

We study the problem of selecting a subset of k random variables from a large set, in order to obtain the best linear prediction of another variable of interest. This problem can be viewed in the context of both feature selection and sparse approximation. We analyze the performance of widely used greedy heuristics, using insights from the maximization of submodular functions and spectral analys...


Subset Selection of Search Heuristics

Constructing a strong heuristic function is a central problem in heuristic search. A common approach is to combine a number of heuristics by maximizing over the values from each. If a limit is placed on this number, then a subset selection problem arises. We treat this as an optimization problem, and proceed by translating a natural loss function into a submodular and monotonic utility function...


Submodularity in Data Subset Selection and Active Learning: Extended Version

We study the problem of selecting a subset of big data to train a classifier while incurring minimal performance loss. We show the connection of submodularity to the data likelihood functions for Naïve Bayes (NB) and Nearest Neighbor (NN) classifiers, and formulate the data subset selection problems for these classifiers as constrained submodular maximization. Furthermore, we apply this framewo...




Publication date: 2016